Decision Tree Grafting From the All Tests But One Partition

Author

  • Geoffrey I. Webb
Abstract

Decision tree grafting adds nodes to an existing decision tree with the objective of reducing prediction error. A new grafting algorithm is presented that considers only one set of training data for each leaf of the initial decision tree: the set of cases that fail at most one test on the path to the leaf. This new technique is demonstrated to retain the error reduction power of the original grafting algorithm while dramatically reducing compute time and the complexity of the inferred tree. Bias/variance analyses reveal that the original grafting technique operated primarily by variance reduction, while the new technique reduces both bias and variance.
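The "all tests but one" set described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the representation of a path as a list of `(attribute, operator, threshold)` tests, the helper names `passes` and `all_tests_but_one`, and the toy data are all assumptions made for illustration. It shows only how the training cases for one leaf might be selected, not how grafted nodes are then chosen.

```python
from typing import Dict, List, Tuple

# Hypothetical representations (assumptions, not from the paper):
# a case maps attribute names to numeric values; a test is
# (attribute, operator, threshold) with operator "<=" or ">".
Case = Dict[str, float]
Test = Tuple[str, str, float]


def passes(case: Case, test: Test) -> bool:
    """Return True if the case satisfies a single test on the path."""
    attr, op, threshold = test
    value = case[attr]
    return value <= threshold if op == "<=" else value > threshold


def all_tests_but_one(cases: List[Case], path: List[Test]) -> List[Case]:
    """Select the cases that fail at most one test on the path to a leaf."""
    return [c for c in cases if sum(not passes(c, t) for t in path) <= 1]


# Toy path to a leaf: x <= 5 and y > 2.
path_to_leaf: List[Test] = [("x", "<=", 5.0), ("y", ">", 2.0)]
cases: List[Case] = [
    {"x": 3.0, "y": 4.0},  # passes both tests (reaches the leaf)
    {"x": 7.0, "y": 4.0},  # fails only the x test
    {"x": 3.0, "y": 1.0},  # fails only the y test
    {"x": 7.0, "y": 1.0},  # fails both tests, so it is excluded
]
selected = all_tests_but_one(cases, path_to_leaf)
```

In this toy example the selected set contains the cases that reach the leaf plus those that miss it by exactly one test, so it is larger than the leaf's own training set but built once per leaf.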

Similar resources

Separating Decision Tree Complexity from Subcube Partition Complexity

The subcube partition model of computation is at least as powerful as decision trees but no separation between these models was known. We show that there exists a function whose deterministic subcube partition complexity is asymptotically smaller than its randomized decision tree complexity, resolving an open problem of Friedgut, Kahn, and Wigderson (2002). Our lower bound is based on the infor...

TREE AUTOMATA BASED ON COMPLETE RESIDUATED LATTICE-VALUED LOGIC: REDUCTION ALGORITHM AND DECISION PROBLEMS

In this paper, at first we define the concepts of response function and accessible states of a complete residuated lattice-valued (for simplicity we write $\mathcal{L}$-valued) tree automaton with a threshold $c.$ Then, related to these concepts, we prove some lemmas and theorems that are applied in considering some decision problems such as finiteness-value and emptiness-value of recognizable t...

A Decision Tree for Technology Selection of Nitrogen Production Plants

Nitrogen is produced mainly from its most abundant source, the air, using three processes: membrane, pressure swing adsorption (PSA) and cryogenic. The most common method for evaluating a process is to use selection diagrams based on feasibility studies. Since the selection diagrams are presented by different companies, they are biased, and provide dissimilar and even controversial results. I...

Decision Tree Studies in Estimating Breast Cancer Risk Using Single Nucleotide Polymorphisms

Abstract Introduction: A decision tree is a data mining tool used to collect, accurately predict, and sift information from massive amounts of data, and it is widely used in the fields of computational biology and bioinformatics. In bioinformatics, it can be used to predict diseases, including breast cancer. The use of genomic data including single nucleotide polymorphisms is a very important ...

Determining Factors Influencing Length of Stay and Predicting Length of Stay Using Data Mining in the General Surgery Department

Background: Length of stay is one of the most important indicators in assessing hospital performance. A shorter stay can reduce the costs per discharge and shift care from inpatient to less expensive post-acute settings. It can lead to a greater readmission rate, better resource management, and more efficient services. Objective: This study aimed to ident...

Publication date: 1999